Hybrid language models for speech transcription
Authors
Abstract
This paper analyzes the use of hybrid language models for automatic speech transcription. The longer-term goal is to use this approach to support communication with deaf people, running on an embedded decoder on a portable device, which constrains the model size. The main linguistic units considered for this task are words and syllables. Various lexicon sizes are studied by setting thresholds on word occurrence frequencies in the training data; words that fall below the threshold are syllabified. Depending on the frequency threshold, between 62% and 96% of the units output by a recognizer using this kind of language model are words; the remaining recognized lexical units are syllables. By setting different thresholds on the confidence measures associated with the recognized words, the most reliable word hypotheses can be identified, with correct recognition rates between 70% and 92%.
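The two thresholding steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy syllable table, and the leading underscore used to mark syllable units are all assumptions made for illustration; a real system would derive syllables from pronunciations and take confidence scores from the decoder.

```python
from collections import Counter

# Hypothetical syllable table; a real syllabifier would be far more complete.
TOY_SYLLABLES = {"talks": ["_talks"], "listens": ["_lis", "_tens"]}


def toy_syllabify(word):
    """Look up a word's syllables; fall back to one syllable unit (assumption)."""
    return TOY_SYLLABLES.get(word, ["_" + word])


def build_hybrid_corpus(sentences, syllabify, min_count):
    """Keep words seen at least min_count times; replace the others by syllables."""
    counts = Counter(word for sentence in sentences for word in sentence)
    lexicon = {word for word, count in counts.items() if count >= min_count}
    hybrid = []
    for sentence in sentences:
        units = []
        for word in sentence:
            if word in lexicon:
                units.append(word)              # frequent word stays a word unit
            else:
                units.extend(syllabify(word))   # rare word becomes syllable units
        hybrid.append(units)
    return lexicon, hybrid


def filter_by_confidence(hypotheses, threshold):
    """Keep the units whose confidence score reaches the threshold."""
    return [unit for unit, confidence in hypotheses if confidence >= threshold]


# Toy training data: "the" and "speaker" occur twice, the other words once.
sentences = [["the", "speaker", "talks"], ["the", "speaker", "listens"]]
lexicon, hybrid = build_hybrid_corpus(sentences, toy_syllabify, min_count=2)
print(sorted(lexicon))
print(hybrid)
```

Raising `min_count` shrinks the word lexicon and pushes more of the text into syllable units, which is the size trade-off the abstract refers to. A language model would then be trained on the mixed word and syllable sequences.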
Related articles
Open vocabulary speech recognition with flat hybrid models
Today’s speech recognition systems are able to recognize arbitrary sentences over a large but finite vocabulary. However, many important speech recognition tasks feature an open, constantly changing vocabulary (e.g. broadcast news transcription, translation of political debates, etc.). Ideally, a system designed for such open vocabulary tasks would be able to recognize arbitrary, even previously...
Context dependent modelling approaches for hybrid speech recognizers
Speech recognition based on connectionist approaches is one of the most successful alternatives to widespread Gaussian systems. One of the main claims against hybrid recognizers is the increased complexity for context-dependent phone modeling, which is a key aspect in medium to large size vocabulary tasks. In this paper, we investigate the use of context-dependent triphone models in a connectio...
A Transcription Scheme for Languages Employing the Arabic Script Motivated by Speech Processing Applications
This paper offers a transcription system for Persian, the target language in the Transonics project, a speech-to-speech translation system developed as a part of the DARPA Babylon program (The DARPA Babylon Program; Narayanan, 2003). In this paper, we discuss transcription systems needed for automated spoken language processing applications in Persian, which uses the Arabic script for writing. Th...
An RNN-based Music Language Model for Improving Automatic Music Transcription
In this paper, we investigate the use of Music Language Models (MLMs) for improving Automatic Music Transcription performance. The MLMs are trained on sequences of symbolic polyphonic music from the Nottingham dataset. We train Recurrent Neural Network (RNN)-based models, as they are capable of capturing complex temporal structure present in symbolic music data. Similar to the function of langua...